STATISTICS


14.1 Introduction 


In Class IX, you have studied the classification of given data into ungrouped as well as 
grouped frequency distributions. You have also learnt to represent the data pictorially 
in the form of various graphs such as bar graphs, histograms (including those of varying 
widths) and frequency polygons. In fact, you went a step further by studying certain 
numerical representatives of the ungrouped data, also called measures of central 
tendency, namely, mean, median and mode. In this chapter, we shall extend the study 
of these three measures, i.e., mean, median and mode from ungrouped data to that of
grouped data. We shall also discuss the concept of cumulative frequency, the 
cumulative frequency distribution and how to draw cumulative frequency curves, called 
ogives. 


14.2 Mean of Grouped Data 


The mean (or average) of observations, as we know, is the sum of the values of all the 
observations divided by the total number of observations. From Class IX, recall that if 
$x_1, x_2, \ldots, x_n$ are observations with respective frequencies $f_1, f_2, \ldots, f_n$, then this
means observation $x_1$ occurs $f_1$ times, $x_2$ occurs $f_2$ times, and so on.


Now, the sum of the values of all the observations $= f_1x_1 + f_2x_2 + \cdots + f_nx_n$, and
the number of observations $= f_1 + f_2 + \cdots + f_n$.

So, the mean $\bar{x}$ of the data is given by

$$\bar{x} = \frac{f_1x_1 + f_2x_2 + \cdots + f_nx_n}{f_1 + f_2 + \cdots + f_n}$$

Recall that we can write this in short form by using the Greek letter $\Sigma$ (capital
sigma) which means summation. That is,

$$\bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i}$$

which, more briefly, is written as $\bar{x} = \dfrac{\sum f_i x_i}{\sum f_i}$, if it is understood that $i$ varies from
1 to $n$.


Let us apply this formula to find the mean in the following example. 


Example 1 : The marks obtained by 30 students of Class X of a certain school in a
Mathematics paper consisting of 100 marks are presented in the table below. Find the
mean of the marks obtained by the students.


Marks obtained ($x_i$)     | 10 | 20 | 36 | 40 | 50 | 56 | 60 | 70 | 72 | 80 | 88 | 92 | 95
Number of students ($f_i$) |  1 |  1 |  3 |  4 |  3 |  2 |  4 |  4 |  1 |  1 |  2 |  3 |  1


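Solution : Recall that to find the mean marks, we require the product of each $x_i$ with
the corresponding frequency $f_i$. So, let us put them in a column as shown in Table 14.1.


Table 14.1

Marks obtained ($x_i$) | Number of students ($f_i$) | $f_i x_i$
10    | 1  | 10
20    | 1  | 20
36    | 3  | 108
40    | 4  | 160
50    | 3  | 150
56    | 2  | 112
60    | 4  | 240
70    | 4  | 280
72    | 1  | 72
80    | 1  | 80
88    | 2  | 176
92    | 3  | 276
95    | 1  | 95
Total | $\sum f_i = 30$ | $\sum f_i x_i = 1779$


Now, $\bar{x} = \dfrac{\sum f_i x_i}{\sum f_i} = \dfrac{1779}{30} = 59.3$


Therefore, the mean marks obtained is 59.3.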
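
The calculation of Example 1 can be checked with a minimal Python sketch. The marks and frequencies below are read from the table of Example 1; the function name `frequency_mean` is only illustrative and is not part of the chapter.

```python
# Marks (x_i) and number of students (f_i) as read from Example 1.
marks = [10, 20, 36, 40, 50, 56, 60, 70, 72, 80, 88, 92, 95]
students = [1, 1, 3, 4, 3, 2, 4, 4, 1, 1, 2, 3, 1]


def frequency_mean(values, freqs):
    """Return (sum of f_i * x_i, sum of f_i, mean) for a frequency table."""
    total = sum(f * x for x, f in zip(values, freqs))
    count = sum(freqs)
    return total, count, total / count


sum_fx, n, mean = frequency_mean(marks, students)
print(sum_fx, n, round(mean, 1))
```

The two sums correspond to the totals of the last two columns of Table 14.1, and their quotient is the mean.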


In most of our real life situations, data is usually so large that to make a meaningful 
study it needs to be condensed as grouped data. So, we need to convert given ungrouped 
data into grouped data and devise some method to find its mean. 


Let us convert the ungrouped data of Example 1 into grouped data by forming
class-intervals of width, say 15. Remember that, while allocating frequencies to each 
class-interval, students falling in any upper class-limit would be considered in the next 
class, e.g., 4 students who have obtained 40 marks would be considered in the class- 
interval 40-55 and not in 25-40. With this convention in our mind, let us form a grouped 
frequency distribution table (see Table 14.2). 


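Table 14.2

Class interval     | 10 - 25 | 25 - 40 | 40 - 55 | 55 - 70 | 70 - 85 | 85 - 100
Number of students |    2    |    3    |    7    |    6    |    6    |    6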





Now, for each class-interval, we require a point which would serve as the
representative of the whole class. It is assumed that the frequency of each
class-interval is centred around its mid-point. So the mid-point (or class mark) of each
class can be chosen to represent the observations falling in the class. Recall that we
find the mid-point of a class (or its class mark) by finding the average of its upper and
lower limits. That is,

$$\text{Class mark} = \frac{\text{Upper class limit} + \text{Lower class limit}}{2}$$

With reference to Table 14.2, for the class 10 - 25, the class mark is $\dfrac{10 + 25}{2}$, i.e.,
17.5. Similarly, we can find the class marks of the remaining class intervals. We put
them in Table 14.3. These class marks serve as our $x_i$'s. Now, in general, for the $i$th
class interval, we have the frequency $f_i$ corresponding to the class mark $x_i$. We can
now proceed to compute the mean in the same manner as in Example 1.
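Table 14.3

Class interval | Number of students ($f_i$) | Class mark ($x_i$) | $f_i x_i$
10 - 25  | 2 | 17.5 | 35.0
25 - 40  | 3 | 32.5 | 97.5
40 - 55  | 7 | 47.5 | 332.5
55 - 70  | 6 | 62.5 | 375.0
70 - 85  | 6 | 77.5 | 465.0
85 - 100 | 6 | 92.5 | 555.0
Total    | $\sum f_i = 30$ |  | $\sum f_i x_i = 1860.0$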





The sum of the values in the last column gives us $\sum f_i x_i$. So, the mean $\bar{x}$ of the
given data is given by

$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{1860.0}{30} = 62$$

This new method of finding the mean is known as the Direct Method.








We observe that Tables 14.1 and 14.3 are using the same data and employing the 
same formula for the calculation of the mean but the results obtained are different. 
Can you think why this is so, and which one is more accurate? The difference in the
two values is because of the mid-point assumption in Table 14.3, 59.3 being the exact
mean, while 62 is an approximate mean.


Sometimes when the numerical values of $x_i$ and $f_i$ are large, finding the product
of $x_i$ and $f_i$ becomes tedious and time consuming. So, for such situations, let us think of
a method of reducing these calculations.


We can do nothing with the $f_i$'s, but we can change each $x_i$ to a smaller number
so that our calculations become easy. How do we do this? What about subtracting a
fixed number from each of these $x_i$'s? Let us try this method.


The first step is to choose one among the $x_i$'s as the assumed mean, and denote
it by ‘a’. Also, to further reduce our calculation work, we may take ‘a’ to be that $x_i$
which lies in the centre of $x_1, x_2, \ldots, x_n$. So, we can choose a = 47.5 or a = 62.5. Let
us choose a = 47.5.


The next step is to find the difference $d_i$ between a and each of the $x_i$'s, that is,
the deviation of ‘a’ from each of the $x_i$'s,

i.e., $d_i = x_i - a = x_i - 47.5$


The third step is to find the product of $d_i$ with the corresponding $f_i$, and take the sum
of all the $f_i d_i$'s. The calculations are shown in Table 14.4.


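Table 14.4

Class interval | Number of students ($f_i$) | Class mark ($x_i$) | $d_i = x_i - 47.5$ | $f_i d_i$
10 - 25  | 2 | 17.5 | -30 | -60
25 - 40  | 3 | 32.5 | -15 | -45
40 - 55  | 7 | 47.5 | 0   | 0
55 - 70  | 6 | 62.5 | 15  | 90
70 - 85  | 6 | 77.5 | 30  | 180
85 - 100 | 6 | 92.5 | 45  | 270
Total    | $\sum f_i = 30$ |  |  | $\sum f_i d_i = 435$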





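So, from Table 14.4, the mean of the deviations, $\bar{d} = \dfrac{\sum f_i d_i}{\sum f_i}$.


Now, let us find the relation between $\bar{d}$ and $\bar{x}$.
Since in obtaining $d_i$, we subtracted ‘a’ from each $x_i$, so, in order to get the mean
$\bar{x}$, we need to add ‘a’ to $\bar{d}$. This can be explained mathematically as:

Mean of deviations, $\bar{d} = \dfrac{\sum f_i d_i}{\sum f_i}$

So,
$$\begin{aligned}
\bar{d} &= \frac{\sum f_i (x_i - a)}{\sum f_i} \\
        &= \frac{\sum f_i x_i}{\sum f_i} - \frac{\sum f_i a}{\sum f_i} \\
        &= \bar{x} - a\,\frac{\sum f_i}{\sum f_i} \\
        &= \bar{x} - a
\end{aligned}$$

So, $\bar{x} = a + \bar{d}$

i.e., $\bar{x} = a + \dfrac{\sum f_i d_i}{\sum f_i}$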
9 ps 
Substituting the values of a, $\sum f_i d_i$ and $\sum f_i$ from Table 14.4, we get

$$\bar{x} = 47.5 + \frac{435}{30} = 47.5 + 14.5 = 62.$$

Therefore, the mean of the marks obtained by the students is 62.
The method discussed above is called the Assumed Mean Method.


Activity 1 : From the Table 14.3 find the mean by taking each of $x_i$ (i.e., 17.5, 32.5,
and so on) as ‘a’. What do you observe? You will find that the mean determined in
each case is the same, i.e., 62. (Why?)


So, we can say that the value of the mean obtained does not depend on the 
choice of ‘a’. 

Observe that in Table 14.4, the values in Column 4 are all multiples of 15. So, if
we divide the values in the entire Column 4 by 15, we would get smaller numbers to
multiply with $f_i$. (Here, 15 is the class size of each class interval.)


So, let $u_i = \dfrac{x_i - a}{h}$, where a is the assumed mean and h is the class size.


Now, we calculate $u_i$ in this way and continue as before (i.e., find $f_i u_i$ and
then $\sum f_i u_i$). Taking h = 15, let us form Table 14.5.


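Table 14.5

Class interval | $f_i$ | $x_i$ | $d_i = x_i - a$ | $u_i = \dfrac{x_i - a}{h}$ | $f_i u_i$
10 - 25  | 2 | 17.5 | -30 | -2 | -4
25 - 40  | 3 | 32.5 | -15 | -1 | -3
40 - 55  | 7 | 47.5 | 0   | 0  | 0
55 - 70  | 6 | 62.5 | 15  | 1  | 6
70 - 85  | 6 | 77.5 | 30  | 2  | 12
85 - 100 | 6 | 92.5 | 45  | 3  | 18
Total    | $\sum f_i = 30$ |  |  |  | $\sum f_i u_i = 29$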





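Let $\bar{u} = \dfrac{\sum f_i u_i}{\sum f_i}$.


Here, again let us find the relation between $\bar{u}$ and $\bar{x}$.

We have, $u_i = \dfrac{x_i - a}{h}$

Therefore,
$$\begin{aligned}
\bar{u} &= \frac{\sum f_i \dfrac{(x_i - a)}{h}}{\sum f_i} = \frac{1}{h}\left[\frac{\sum f_i x_i - a \sum f_i}{\sum f_i}\right] \\
        &= \frac{1}{h}\left[\frac{\sum f_i x_i}{\sum f_i} - a\,\frac{\sum f_i}{\sum f_i}\right] \\
        &= \frac{1}{h}\left[\bar{x} - a\right]
\end{aligned}$$

So, $h\bar{u} = \bar{x} - a$

i.e., $\bar{x} = a + h\bar{u}$

So, $\bar{x} = a + h\left(\dfrac{\sum f_i u_i}{\sum f_i}\right)$

Now, substituting the values of a, h, $\sum f_i u_i$ and $\sum f_i$ from Table 14.5, we get

$$\bar{x} = 47.5 + 15 \times \left(\frac{29}{30}\right) = 47.5 + 14.5 = 62$$


So, the mean marks obtained by a student is 62.
The method discussed above is called the Step-deviation method.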
We note that :


• the step-deviation method will be convenient to apply if all the $d_i$'s have a
common factor.


• The mean obtained by all the three methods is the same.


• The assumed mean method and step-deviation method are just simplified
forms of the direct method.


• The formula $\bar{x} = a + h\bar{u}$ still holds if a and h are not as given above, but are
any non-zero numbers such that $u_i = \dfrac{x_i - a}{h}$.

Let us apply these methods in another example. 
Example 2 : The table below gives the percentage distribution of female teachers in
the primary schools of rural areas of various states and union territories (U.T.) of
India. Find the mean percentage of female teachers by all the three methods discussed
in this section.


Percentage of female teachers | 15 - 25 | 25 - 35 | 35 - 45 | 45 - 55 | 55 - 65 | 65 - 75 | 75 - 85
Number of States/U.T.         |    6    |   11    |    7    |    4    |    4    |    2    |    1


Source : Seventh All India School Education Survey conducted by NCERT


Solution : Let us find the class marks, $x_i$, of each class, and put them in a column
(see Table 14.6):


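Table 14.6

Percentage of female teachers | Number of States/U.T. ($f_i$) | $x_i$
15 - 25 | 6  | 20
25 - 35 | 11 | 30
35 - 45 | 7  | 40
45 - 55 | 4  | 50
55 - 65 | 4  | 60
65 - 75 | 2  | 70
75 - 85 | 1  | 80


Here we take a = 50, h = 10, then $d_i = x_i - 50$ and $u_i = \dfrac{x_i - 50}{10}$.


We now find $d_i$ and $u_i$ and put them in Table 14.7.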
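Table 14.7

Percentage of female teachers | $f_i$ | $x_i$ | $d_i = x_i - 50$ | $u_i = \dfrac{x_i - 50}{10}$ | $f_i x_i$ | $f_i d_i$ | $f_i u_i$
15 - 25 | 6  | 20 | -30 | -3 | 120  | -180 | -18
25 - 35 | 11 | 30 | -20 | -2 | 330  | -220 | -22
35 - 45 | 7  | 40 | -10 | -1 | 280  | -70  | -7
45 - 55 | 4  | 50 | 0   | 0  | 200  | 0    | 0
55 - 65 | 4  | 60 | 10  | 1  | 240  | 40   | 4
65 - 75 | 2  | 70 | 20  | 2  | 140  | 40   | 4
75 - 85 | 1  | 80 | 30  | 3  | 80   | 30   | 3
Total   | 35 |    |     |    | 1390 | -360 | -36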





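From the table above, we obtain $\sum f_i = 35$, $\sum f_i x_i = 1390$,
$\sum f_i d_i = -360$, $\sum f_i u_i = -36$.


Using the direct method,

$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{1390}{35} = 39.71$$

Using the assumed mean method,

$$\bar{x} = a + \frac{\sum f_i d_i}{\sum f_i} = 50 + \frac{(-360)}{35} = 39.71$$

Using the step-deviation method,

$$\bar{x} = a + \left(\frac{\sum f_i u_i}{\sum f_i}\right) \times h = 50 + \left(\frac{-36}{35}\right) \times 10 = 39.71$$


Therefore, the mean percentage of female teachers in the primary schools of
rural areas is 39.71.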
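
The three methods can be compared side by side in a short Python sketch. It uses the class limits of Example 2; the first and last frequencies (6 and 1) are taken to be the values implied by the totals $\sum f_i = 35$ and $\sum f_i x_i = 1390$ stated above, and all function names are illustrative choices rather than anything defined in the chapter.

```python
# Class limits and frequencies (Example 2: percentage of female teachers).
classes = [(15, 25), (25, 35), (35, 45), (45, 55), (55, 65), (65, 75), (75, 85)]
freqs = [6, 11, 7, 4, 4, 2, 1]  # 6 and 1 are inferred from the stated totals


def class_marks(intervals):
    """Mid-point of each class: (upper limit + lower limit) / 2."""
    return [(low + high) / 2 for low, high in intervals]


def direct_mean(x, f):
    """Direct method: sum(f_i * x_i) / sum(f_i)."""
    return sum(fi * xi for xi, fi in zip(x, f)) / sum(f)


def assumed_mean_method(x, f, a):
    """Assumed mean method: a + sum(f_i * d_i) / sum(f_i), with d_i = x_i - a."""
    return a + sum(fi * (xi - a) for xi, fi in zip(x, f)) / sum(f)


def step_deviation_method(x, f, a, h):
    """Step-deviation method: a + h * sum(f_i * u_i) / sum(f_i), u_i = (x_i - a) / h."""
    return a + h * sum(fi * (xi - a) / h for xi, fi in zip(x, f)) / sum(f)


x = class_marks(classes)
means = {
    "direct": direct_mean(x, freqs),
    "assumed mean": assumed_mean_method(x, freqs, a=50),
    "step-deviation": step_deviation_method(x, freqs, a=50, h=10),
}
for name, value in means.items():
    print(f"{name}: {value:.2f}")

# Trying every class mark in turn as the assumed mean 'a' gives the same mean.
means_for_each_a = [assumed_mean_method(x, freqs, a) for a in x]
```

The last line repeats the idea of Activity 1 on this data: the result does not depend on which class mark is chosen as ‘a’.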


Remark : The result obtained by all the three methods is the same. So the choice of
method to be used depends on the numerical values of $x_i$ and $f_i$. If $x_i$ and $f_i$ are
sufficiently small, then the direct method is an appropriate choice. If $x_i$ and $f_i$ are
numerically large numbers, then we can go for the assumed mean method or
step-deviation method. If the class sizes are unequal, and $x_i$ are large numerically, we
can still apply the step-deviation method by taking h to be a suitable divisor of all the $d_i$'s.


2020-21 


Example 3 : The distribution below shows the number of wickets taken by bowlers in
one-day cricket matches. Find the mean number of wickets by choosing a suitable
method. What does the mean signify?


Number of wickets | 20 - 60 | 60 - 100 | 100 - 150 | 150 - 250 | 250 - 350 | 350 - 450
Number of bowlers |    7    |    5     |    16     |    12     |     2     |     3


Solution : Here, the class size varies, and the $x_i$'s are large. Let us still apply the
step-deviation method with a = 200 and h = 20. Then, we obtain the data as in Table 14.8.


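Table 14.8

Number of wickets taken | Number of bowlers ($f_i$) | $x_i$ | $d_i = x_i - 200$ | $u_i = \dfrac{d_i}{20}$ | $u_i f_i$
20 - 60   | 7  | 40  | -160 | -8    | -56
60 - 100  | 5  | 80  | -120 | -6    | -30
100 - 150 | 16 | 125 | -75  | -3.75 | -60
150 - 250 | 12 | 200 | 0    | 0     | 0
250 - 350 | 2  | 300 | 100  | 5     | 10
350 - 450 | 3  | 400 | 200  | 10    | 30
Total     | 45 |     |      |       | -106


So, $\bar{u} = \dfrac{-106}{45}$. Therefore, $\bar{x} = 200 + 20\left(\dfrac{-106}{45}\right) = 200 - 47.11 = 152.89$.


This tells us that, on an average, the number of wickets taken by these 45 bowlers
in one-day cricket is 152.89.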
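
As a rough check of Example 3, the step-deviation method with unequal class sizes can be sketched in Python as follows. The class limits, frequencies, a = 200 and h = 20 are those of the example; the variable names are illustrative only.

```python
# Example 3: classes of unequal width, with a = 200 and h = 20 as in the text.
classes = [(20, 60), (60, 100), (100, 150), (150, 250), (250, 350), (350, 450)]
bowlers = [7, 5, 16, 12, 2, 3]
a, h = 200, 20

# Class marks x_i and step deviations u_i = (x_i - a) / h.
class_marks = [(low + high) / 2 for low, high in classes]
u = [(x - a) / h for x in class_marks]

sum_fu = sum(f * ui for f, ui in zip(bowlers, u))
n = sum(bowlers)
mean = a + h * sum_fu / n

print(sum_fu, n, round(mean, 2))
```

Here h is simply a convenient divisor, not the class size, which is why one of the $u_i$ values (-3.75) is not a whole number.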


Now, let us see how well you can apply the concepts discussed in this section! 
Activity 2 :


Divide the students of your class into three groups and ask each group to do one of the 
following activities. 


1. Collect the marks obtained by all the students of your class in Mathematics in the 
latest examination conducted by your school. Form a grouped frequency distribution 
of the data obtained. 


2. Collect the daily maximum temperatures recorded for a period of 30 days in your 
city. Present this data as a grouped frequency table. 


3. Measure the heights of all the students of your class (in cm) and form a grouped 
frequency distribution table of this data. 


After all the groups have collected the data and formed grouped frequency 
distribution tables, the groups should find the mean in each case by the method which 
they find appropriate. 


EXERCISE 14.1 


1. A survey was conducted by a group of students as a part of their environment awareness 
programme, in which they collected the following data regarding the number of plants in 
20 houses in a locality. Find the mean number of plants per house. 





Which method did you use for finding the mean, and why? 


2. Consider the following distribution of daily wages of 50 workers of a factory. 





Find the mean daily wages of the workers of the factory by using an appropriate method. 


3. The following distribution shows the daily pocket allowance of children of a locality. 
The mean pocket allowance is Rs 18. Find the missing frequency f. 
4. Thirty women were examined in a hospital by a doctor and the number of heartbeats per
minute were recorded and summarised as follows. Find the mean heartbeats per minute
for these women, choosing a suitable method.


Number of heartbeats per minute | 65 - 68 | 68 - 71 | 71 - 74 | 74 - 77 | 77 - 80 | 80 - 83 | 83 - 86





5. In a retail market, fruit vendors were selling mangoes kept in packing boxes. These
boxes contained varying number of mangoes. The following was the distribution of
mangoes according to the number of boxes.


Number of mangoes | 50 - 52 | 53 - 55 | 56 - 58 | 59 - 61 | 62 - 64


Number of boxes





Find the mean number of mangoes kept in a packing box. Which method of finding
the mean did you choose?


6. The table below shows the daily expenditure on food of 25 households in a locality.


Daily expenditure (in ₹) | 100 - 150 | 150 - 200 | 200 - 250 | 250 - 300 | 300 - 350
Number of households     |     4     |     5     |    12     |     2     |     2


Find the mean daily expenditure on food by a suitable method.





7. To find out the concentration of SO₂ in the air (in parts per million, i.e., ppm), the data
was collected for 30 localities in a certain city and is presented below:


Concentration of SO₂ (in ppm)


0.00 - 0.04
0.04 - 0.08
0.08 - 0.12
0.12 - 0.16
0.16 - 0.20
0.20 - 0.24





Find the mean concentration of SO₂ in the air.
8. A class teacher has the following absentee record of 40 students of a class for the whole
term. Find the mean number of days a student was absent. 





9. The following table gives the literacy rate (in percentage) of 35 cities. Find the mean 
literacy rate. 





14.3 Mode of Grouped Data 


Recall from Class IX, a mode is that value among the observations which occurs most 
often, that is, the value of the observation having the maximum frequency. Further, we 
discussed finding the mode of ungrouped data. Here, we shall discuss ways of obtaining 
a mode of grouped data. It is possible that more than one value may have the same 
maximum frequency. In such situations, the data is said to be multimodal. Though 
grouped data can also be multimodal, we shall restrict ourselves to problems having a 
single mode only. 


Let us first recall how we found the mode for ungrouped data through the following 
example. 
Example 4 : The wickets taken by a bowler in 10 cricket matches are as follows:
2 6 4 5 0 2 1 3 2 3
Find the mode of the data.


Solution : Let us form the frequency distribution table of the given data as follows:

Number of wickets | 0 | 1 | 2 | 3 | 4 | 5 | 6
Number of matches | 1 | 1 | 3 | 2 | 1 | 1 | 1

Clearly, 2 is the number of wickets taken by the bowler in the maximum number
(i.e., 3) of matches. So, the mode of this data is 2.


In a grouped frequency distribution, it is not possible to determine the mode by 
looking at the frequencies. Here, we can only locate a class with the maximum 
frequency, called the modal class. The mode is a value inside the modal class, and is 
given by the formula: 


$$\text{Mode} = l + \left(\frac{f_1 - f_0}{2f_1 - f_0 - f_2}\right) \times h$$

where $l$ = lower limit of the modal class,
$h$ = size of the class interval (assuming all class sizes to be equal),
$f_1$ = frequency of the modal class,
$f_0$ = frequency of the class preceding the modal class,
$f_2$ = frequency of the class succeeding the modal class.
Let us consider the following examples to illustrate the use of this formula. 
Example 5 : A survey conducted on 20 households in a locality by a group of students 


resulted in the following frequency table for the number of family members in a 
household: 





Find the mode of this data. 


Solution : Here the maximum class frequency is 8, and the class corresponding to this
frequency is 3 - 5. So, the modal class is 3 - 5.


Now
modal class = 3 - 5, lower limit ($l$) of modal class = 3, class size ($h$) = 2
frequency ($f_1$) of the modal class = 8,
frequency ($f_0$) of class preceding the modal class = 7,
frequency ($f_2$) of class succeeding the modal class = 2.


Now, let us substitute these values in the formula :

$$\text{Mode} = l + \left(\frac{f_1 - f_0}{2f_1 - f_0 - f_2}\right) \times h = 3 + \left(\frac{8 - 7}{2 \times 8 - 7 - 2}\right) \times 2 = 3 + \frac{2}{7} = 3.286$$


Therefore, the mode of the data above is 3.286.
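
The substitution in Example 5 can be reproduced with a small Python function. This is only a sketch: the name `grouped_mode` and its parameter names are illustrative, and the function assumes equal class sizes and a single modal class, as the chapter does.

```python
def grouped_mode(lower, h, f1, f0, f2):
    """Mode = l + ((f1 - f0) / (2*f1 - f0 - f2)) * h for a single modal class.

    lower: lower limit l of the modal class
    h:     class size (all classes assumed equal)
    f1:    frequency of the modal class
    f0:    frequency of the class preceding the modal class
    f2:    frequency of the class succeeding the modal class
    """
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        # Happens only when f0 = f1 = f2, where the formula gives no answer.
        raise ValueError("formula is undefined when 2*f1 - f0 - f2 is zero")
    return lower + (f1 - f0) / denominator * h


# Values of Example 5: l = 3, h = 2, f1 = 8, f0 = 7, f2 = 2.
mode_example_5 = grouped_mode(lower=3, h=2, f1=8, f0=7, f2=2)
print(round(mode_example_5, 3))
```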


Example 6 : The marks distribution of 30 students in a mathematics examination are 
given in Table 14.3 of Example 1. Find the mode of this data. Also compare and 
interpret the mode and the mean. 


Solution : Refer to Table 14.3 of Example 1. Since the maximum number of students
(i.e., 7) have got marks in the interval 40 - 55, the modal class is 40 - 55. Therefore,


the lower limit ($l$) of the modal class = 40,

the class size ($h$) = 15,

the frequency ($f_1$) of modal class = 7,

the frequency ($f_0$) of the class preceding the modal class = 3,

the frequency ($f_2$) of the class succeeding the modal class = 6.
Now, using the formula:

$$\text{Mode} = l + \left(\frac{f_1 - f_0}{2f_1 - f_0 - f_2}\right) \times h,$$

we get $\text{Mode} = 40 + \left(\dfrac{7 - 3}{14 - 6 - 3}\right) \times 15 = 52$


So, the mode marks is 52. 
Now, from Example 1, you know that the mean marks is 62. 


So, the maximum number of students obtained 52 marks, while on an average a 
student obtained 62 marks. 


Remarks : 


1. In Example 6, the mode is less than the mean. But for some other problems it may
be equal to or more than the mean also.


2. It depends upon the demand of the situation whether we are interested in finding the
average marks obtained by the students or the average of the marks obtained by most
of the students. In the first situation, the mean is required and in the second situation,
the mode is required.


Activity 3 : Continuing with the same groups as formed in Activity 2 and the situations
assigned to the groups, ask each group to find the mode of the data. They should also
compare this with the mean, and interpret the meaning of both.


Remark : The mode can also be calculated for grouped data with unequal class sizes. 
However, we shall not be discussing it. 


EXERCISE 14.2 


1. The following table shows the ages of the patients admitted in a hospital during a year: 





Find the mode and the mean of the data given above. Compare and interpret the two 
measures of central tendency. 


2. The following data gives the information on the observed lifetimes (in hours) of 225 
electrical components : 





Determine the modal lifetimes of the components. 


3. The following data gives the distribution of total monthly household expenditure of 200 
families of a village. Find the modal monthly expenditure of the families. Also, find the 
mean monthly expenditure : 
4. The following distribution gives the state-wise teacher-student ratio in higher
secondary schools of India. Find the mode and mean of this data. Interpret the two 
measures. 


Number of students per teacher Number of states /U.T. 





5. The given distribution shows the number of runs scored by some top batsmen of the 
world in one-day international cricket matches. 


3000 - 4000 
4000 - 5000 
5000 - 6000 
6000 - 7000 
7000 - 8000 
8000 - 9000 
9000 - 10000 
10000 - 11000 





Find the mode of the data. 


6. A student noted the number of cars passing through a spot on a road for 100 
periods each of 3 minutes and summarised it in the table given below. Find the mode 
of the data : 


Number of cars | 10 - 20 | 20 - 30 | 30 - 40 | 40 - 50 | 50 - 60 | 60 - 70 | 70 - 80





14.4 Median of Grouped Data


As you have studied in Class IX, the median is a measure of central tendency which 
gives the value of the middle-most observation in the data. Recall that for finding the 
median of ungrouped data, we first arrange the data values of the observations in
ascending order. Then, if n is odd, the median is the $\left(\dfrac{n+1}{2}\right)$th observation. And, if n
is even, then the median will be the average of the $\dfrac{n}{2}$th and the $\left(\dfrac{n}{2} + 1\right)$th observations.


Suppose, we have to find the median of the following data, which gives the 
marks, out of 50, obtained by 100 students in a test : 





First, we arrange the marks in ascending order and prepare a frequency table as 
follows : 


Table 14.9 
Here n = 100, which is even. The median will be the average of the $\dfrac{n}{2}$th and the
$\left(\dfrac{n}{2} + 1\right)$th observations, i.e., the 50th and 51st observations. To find these
observations, we proceed as follows:


Table 14.10 


Marks obtained Number of students 


6
6 + 20 = 26
26 + 24 = 50
50 + 28 = 78
78 + 15 = 93
93 + 4 = 97
97 + 2 = 99
99 + 1 = 100





Now we add another column depicting this information to the frequency table 
above and name it as cumulative frequency column. 


Table 14.11 


Marks obtained Number of students Cumulative frequency 


6 
20 
24 
From the table above, we see that:
50th observation is 28 (Why?)
51st observation is 29

So, Median = $\dfrac{28 + 29}{2} = 28.5$


Remark : The part of Table 14.11 consisting of Column 1 and Column 3 is known as
Cumulative Frequency Table. The median marks 28.5 conveys the information that
about 50% of the students obtained marks less than 28.5 and another 50% of the students
obtained marks more than 28.5.


Now, let us see how to obtain the median of grouped data, through the following 
situation. 


Consider a grouped frequency distribution of marks obtained, out of 100, by 53 
students, in a certain examination, as follows: 


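Table 14.12

Marks    | Number of students
0 - 10   | 5
10 - 20  | 3
20 - 30  | 4
30 - 40  | 3
40 - 50  | 3
50 - 60  | 4
60 - 70  | 7
70 - 80  | 9
80 - 90  | 7
90 - 100 | 8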





From the table above, try to answer the following questions: 


How many students have scored marks less than 10? The answer is clearly 5. 
How many students have scored less than 20 marks? Observe that the number
of students who have scored less than 20 include the number of students who have
scored marks from 0 - 10 as well as the number of students who have scored marks
from 10 - 20. So, the total number of students with marks less than 20 is 5 + 3, i.e., 8.
We say that the cumulative frequency of the class 10 - 20 is 8.


Similarly, we can compute the cumulative frequencies of the other classes, i.e.,
the number of students with marks less than 30, less than 40, . . ., less than 100. We
give them in Table 14.13 given below:


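Table 14.13

Marks obtained | Number of students (Cumulative frequency)
Less than 10   | 5
Less than 20   | 5 + 3 = 8
Less than 30   | 8 + 4 = 12
Less than 40   | 12 + 3 = 15
Less than 50   | 15 + 3 = 18
Less than 60   | 18 + 4 = 22
Less than 70   | 22 + 7 = 29
Less than 80   | 29 + 9 = 38
Less than 90   | 38 + 7 = 45
Less than 100  | 45 + 8 = 53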





The distribution given above is called the cumulative frequency distribution of 
the less than type. Here 10, 20, 30, ... 100, are the upper limits of the respective 
class intervals. 

We can similarly make the table for the number of students with scores, more
than or equal to 0, more than or equal to 10, more than or equal to 20, and so on. From
Table 14.12, we observe that all 53 students have scored marks more than or equal to
0. Since there are 5 students scoring marks in the interval 0 - 10, this means that there
are 53 - 5 = 48 students getting more than or equal to 10 marks. Continuing in the
same manner, we get the number of students scoring 20 or above as 48 - 3 = 45, 30 or
above as 45 - 4 = 41, and so on, as shown in Table 14.14.
Table 14.14

Marks obtained           | Number of students (Cumulative frequency)
More than or equal to 0  | 53
More than or equal to 10 | 53 - 5 = 48
More than or equal to 20 | 48 - 3 = 45
More than or equal to 30 | 45 - 4 = 41
More than or equal to 40 | 41 - 3 = 38
More than or equal to 50 | 38 - 3 = 35
More than or equal to 60 | 35 - 4 = 31
More than or equal to 70 | 31 - 7 = 24
More than or equal to 80 | 24 - 9 = 15
More than or equal to 90 | 15 - 7 = 8





The table above is called a cumulative frequency distribution of the more 
than type. Here 0, 10, 20, . . ., 90 give the lower limits of the respective class intervals. 
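
Both kinds of cumulative frequency can be generated from the class frequencies in a few lines of Python. The sketch below uses the frequencies of Table 14.12 and the standard-library function `itertools.accumulate`; the variable names are illustrative.

```python
from itertools import accumulate

# Frequencies of the classes 0 - 10, 10 - 20, ..., 90 - 100 (Table 14.12).
freqs = [5, 3, 4, 3, 3, 4, 7, 9, 7, 8]
n = sum(freqs)

# 'Less than' type: running total up to each upper limit 10, 20, ..., 100.
less_than = list(accumulate(freqs))

# 'More than or equal to' type: students left at each lower limit 0, 10, ..., 90,
# i.e. n minus the number of students counted before that class starts.
more_than = [n - counted_before for counted_before in [0] + less_than[:-1]]

print(less_than)
print(more_than)
```

The two printed lists correspond to the second columns of Tables 14.13 and 14.14 respectively.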


Now, to find the median of grouped data, we can make use of any of these 
cumulative frequency distributions. 


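Let us combine Tables 14.12 and 14.13 to get Table 14.15 given below:
Table 14.15

Marks    | Number of students (f) | Cumulative frequency (cf)
0 - 10   | 5 | 5
10 - 20  | 3 | 8
20 - 30  | 4 | 12
30 - 40  | 3 | 15
40 - 50  | 3 | 18
50 - 60  | 4 | 22
60 - 70  | 7 | 29
70 - 80  | 9 | 38
80 - 90  | 7 | 45
90 - 100 | 8 | 53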


Now in a grouped data, we may not be able to find the middle observation by
looking at the cumulative frequencies as the middle observation will be some value in
a class interval. It is, therefore, necessary to find the value inside a class that divides
the whole distribution into two halves. But which class should this be?


To find this class, we find the cumulative frequencies of all the classes and $\dfrac{n}{2}$.
We now locate the class whose cumulative frequency is greater than (and nearest to)
$\dfrac{n}{2}$. This is called the median class. In the distribution above, n = 53. So, $\dfrac{n}{2} = 26.5$.
Now 60 - 70 is the class whose cumulative frequency 29 is greater than (and nearest
to) $\dfrac{n}{2}$, i.e., 26.5.


Therefore, 60 - 70 is the median class.


After finding the median class, we use the following formula for calculating the 
median. 





$$\text{Median} = l + \left(\frac{\dfrac{n}{2} - cf}{f}\right) \times h,$$

where $l$ = lower limit of median class,
$n$ = number of observations,
$cf$ = cumulative frequency of class preceding the median class,
$f$ = frequency of median class,
$h$ = class size (assuming class size to be equal).


Substituting the values $\dfrac{n}{2} = 26.5$, $l = 60$, $cf = 22$, $f = 7$, $h = 10$
in the formula above, we get

$$\text{Median} = 60 + \left(\frac{26.5 - 22}{7}\right) \times 10 = 60 + \frac{45}{7} = 66.4$$


So, about half the students have scored marks less than 66.4, and the other half have 
scored marks more than 66.4. 
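
The two steps described above, locating the median class and then applying the formula, can be sketched in Python. The function name `grouped_median` is illustrative, and the sketch assumes equal class sizes and classes listed in ascending order, using the data of Table 14.15.

```python
from itertools import accumulate


def grouped_median(lower_limits, h, freqs):
    """Return (lower limit of the median class, median) for grouped data."""
    n = sum(freqs)
    cumulative = list(accumulate(freqs))
    # Median class: the first class whose cumulative frequency exceeds n/2.
    k = next(i for i, c in enumerate(cumulative) if c > n / 2)
    cf = cumulative[k - 1] if k > 0 else 0  # cumulative frequency of preceding class
    median = lower_limits[k] + (n / 2 - cf) / freqs[k] * h
    return lower_limits[k], median


# Table 14.15: classes 0 - 10, 10 - 20, ..., 90 - 100.
lower_limits = list(range(0, 100, 10))
freqs = [5, 3, 4, 3, 3, 4, 7, 9, 7, 8]

median_class_lower, median = grouped_median(lower_limits, 10, freqs)
print(median_class_lower, round(median, 1))
```

Because the classes are in ascending order, the first class whose cumulative frequency exceeds n/2 is also the one whose cumulative frequency is nearest to n/2 from above.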
Example 7 : A survey regarding the heights (in cm) of 51 girls of Class X of a school
was conducted and the following data was obtained:


Height (in cm) | Number of girls
Less than 140  | 4
Less than 145  | 11
Less than 150  | 29
Less than 155  | 40
Less than 160  | 46
Less than 165  | 51


Find the median height.


Solution : To calculate the median height, we need to find the class intervals and their
corresponding frequencies.

The given distribution being of the less than type, 140, 145, 150, . . ., 165 give the
upper limits of the corresponding class intervals. So, the classes should be below 140,
140 - 145, 145 - 150, . . ., 160 - 165. Observe that from the given distribution, we find
that there are 4 girls with height less than 140, i.e., the frequency of class interval
below 140 is 4. Now, there are 11 girls with heights less than 145 and 4 girls with
height less than 140. Therefore, the number of girls with height in the interval
140 - 145 is 11 - 4 = 7. Similarly, the frequency of 145 - 150 is 29 - 11 = 18, for
150 - 155, it is 40 - 29 = 11, and so on. So, our frequency distribution table with the
given cumulative frequencies becomes:


Table 14.16

Class intervals | Frequency | Cumulative frequency
Below 140 | 4  | 4
140 - 145 | 7  | 11
145 - 150 | 18 | 29
150 - 155 | 11 | 40
155 - 160 | 6  | 46
160 - 165 | 5  | 51
Now n = 51. So, $\dfrac{n}{2} = \dfrac{51}{2} = 25.5$. This observation lies in the class 145 - 150. Then,
$l$ (the lower limit) = 145,
$cf$ (the cumulative frequency of the class preceding 145 - 150) = 11,
$f$ (the frequency of the median class 145 - 150) = 18,
$h$ (the class size) = 5.


Using the formula, $\text{Median} = l + \left(\dfrac{\dfrac{n}{2} - cf}{f}\right) \times h$, we have

$$\text{Median} = 145 + \left(\frac{25.5 - 11}{18}\right) \times 5 = 145 + \frac{72.5}{18} = 149.03.$$


So, the median height of the girls is 149.03 cm.

This means that the height of about 50% of the girls is less than this height, and 
50% are taller than this height. 
Example 8 : The median of the following data is 525. Find the values of x and y, if the
total frequency is 100.





Solution :





It is given that n = 100

So, 76 + x + y = 100, i.e., x + y = 24 (1)
The median is 525, which lies in the class 500 - 600

So, $l = 500$, $f = 20$, $cf = 36 + x$, $h = 100$


Using the formula : $\text{Median} = l + \left(\dfrac{\dfrac{n}{2} - cf}{f}\right) \times h$, we get

$$525 = 500 + \left(\frac{50 - 36 - x}{20}\right) \times 100$$

i.e., 525 - 500 = (14 - x) × 5
i.e., 25 = 70 - 5x
i.e., 5x = 70 - 25 = 45
So, x = 9


Therefore, from (1), we get 9 + y = 24
i.e., y = 15
Now that you have studied all the three measures of central tendency, let
us discuss which measure would be best suited for a particular requirement.


The mean is the most frequently used measure of central tendency because it
takes into account all the observations, and lies between the extremes, i.e., the largest
and the smallest observations of the entire data. It also enables us to compare two or
more distributions. For example, by comparing the average (mean) results of students
of different schools of a particular examination, we can conclude which school has a
better performance.


However, extreme values in the data affect the mean. For example, the mean of
classes having frequencies more or less the same is a good representative of the data.
But, if one class has frequency, say 2, and the five others have frequency 20, 25, 20,
21, 18, then the mean will certainly not reflect the way the data behaves. So, in such
cases, the mean is not a good representative of the data.


In problems where individual observations are not important, and we wish to find
out a ‘typical’ observation, the median is more appropriate, e.g., finding the typical
productivity rate of workers, average wage in a country, etc. These are situations
where extreme values may be there. So, rather than the mean, we take the median as
a better measure of central tendency.


In situations which require establishing the most frequent value or most popular
item, the mode is the best choice, e.g., to find the most popular T.V. programme being
watched, the consumer item in greatest demand, the colour of the vehicle used by
most of the people, etc.


Remarks :


1. There is an empirical relationship between the three measures of central tendency :
3 Median = Mode + 2 Mean


2. The median of grouped data with unequal class sizes can also be calculated. However,
we shall not discuss it here.
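
The empirical relationship in Remark 1 can be rearranged to estimate any one measure from the other two. The Python sketch below does this; the function names are illustrative, and since the relationship is only empirical, the results should be read as rough estimates rather than exact values.

```python
def empirical_median(mean, mode):
    """Estimate the median from 3 Median = Mode + 2 Mean."""
    return (mode + 2 * mean) / 3


def empirical_mode(mean, median):
    """Estimate the mode from the same relation: Mode = 3 Median - 2 Mean."""
    return 3 * median - 2 * mean


# Mean (62) and mode (52) of the marks data, as found in Example 6.
estimated_median = empirical_median(mean=62, mode=52)
print(round(estimated_median, 2))
```

For the marks data this gives an estimate of about 58.67 for the median, which need not coincide with the value obtained from the median formula.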
1. The following frequency distribution gives the monthly consumption of electricity of
68 consumers of a locality. Find the median, mean and mode of the data and compare
them.


65 - 85 
85 - 105 


105-125 


125-145 


145-165 


165-185 


185 -205 





3. A life insurance agent found the following data for distribution of ages of 100 policy
holders. Calculate the median age, if policies are given only to persons having age 18
years onwards but less than 60 years.

Age (in years) | Number of policy holders


Below 20 


Below 25 


Below 30 


Below 35 


Below 40 


Below 45 


Below 50 


Below 55 


Below 60 





4. The lengths of 40 leaves of a plant are measured correct to the nearest millimetre, and 
the data obtained is represented in the following table : 


Length (in mm) Number of leaves 


118 - 126
127 - 135
136 - 144
145 - 153
154 - 162
163 - 171
172 - 180





Find the median length of the leaves. 


(Hint : The data needs to be converted to continuous classes for finding the median, 
since the formula assumes continuous classes. The classes then change to 
117.5 - 126.5, 126.5 - 135.5, ..., 171.5 - 180.5.) 
5. The following table gives the distribution of the life time of 400 neon lamps :





Find the median life time of a lamp. 


6. 100 surnames were randomly picked up from a local telephone directory and the
frequency distribution of the number of letters of the English alphabet in the surnames
was obtained as follows:





Determine the median number of letters in the surnames. Find the mean number of 
letters in the surnames. Also, find the modal size of the surnames. 


7. The distribution below gives the weights of 30 students of a class. Find the median 
weight of the students. 





14.5 Graphical Representation of Cumulative Frequency Distribution 


As we all know, pictures speak better than words. A graphical representation helps us 
in understanding given data at a glance. In Class IX, we have represented the data 
through bar graphs, histograms and frequency polygons. Let us now represent a 
cumulative frequency distribution graphically. 


For example, let us consider the cumulative frequency distribution given in 
Table 14.13. 
Recall that the values 10, 20, 30, ..., 100 are the upper limits of the 
respective class intervals. To represent the data in the table graphically, we mark 
the upper limits of the class intervals on the horizontal axis (x-axis) and their 
corresponding cumulative frequencies on the vertical axis (y-axis), choosing a 
convenient scale. The scale may not be the same on both the axes. Let us now 
plot the points corresponding to the ordered pairs given by (upper limit, 
corresponding cumulative frequency), i.e., (10, 5), (20, 8), (30, 12), (40, 15), 
(50, 18), (60, 22), (70, 29), (80, 38), (90, 45), (100, 53) on a graph paper and join them 
by a free hand smooth curve. The curve we get is called a cumulative frequency 
curve, or an ogive (of the less than type). (See Fig. 14.1) 

Fig. 14.1 : ‘Less than’ ogive (cumulative frequency plotted against upper limits) 
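The ordered pairs for the ‘less than’ ogive are simply running totals of the class frequencies paired with the upper limits. The sketch below shows one way of computing them. The list `frequencies` is an assumption: it is inferred here as the differences of consecutive cumulative frequencies quoted above, not copied from the table itself.

```python
from itertools import accumulate

# Upper limits of the class intervals 0 - 10, 10 - 20, ..., 90 - 100.
upper_limits = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

# Class frequencies, inferred from the cumulative frequencies in the text
# (5, 8 - 5, 12 - 8, ...); treat them as an assumption.
frequencies = [5, 3, 4, 3, 3, 4, 7, 9, 7, 8]

# Running totals give the 'less than' cumulative frequencies.
less_than_cf = list(accumulate(frequencies))

# Points (upper limit, cumulative frequency) to be plotted.
less_than_points = list(zip(upper_limits, less_than_cf))
print(less_than_points)
```

The last cumulative frequency equals the total number of observations, which is a quick check on the arithmetic.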





Next, again we consider the cumulative frequency distribution given in 
Table 14.14 and draw its ogive (of the more than type). 

Recall that, here 0, 10, 20, ..., 90 are the lower limits of the respective class 
intervals 0 - 10, 10 - 20, ..., 90 - 100. To represent ‘the more than type’ graphically, 
we plot the lower limits on the x-axis and the corresponding cumulative frequencies 
on the y-axis. Then we plot the points (lower limit, corresponding cumulative 
frequency), i.e., (0, 53), (10, 48), (20, 45), (30, 41), (40, 38), (50, 35), (60, 31), 
(70, 24), (80, 15), (90, 8), on a graph paper, and join them by a free hand smooth curve. 
The curve we get is a cumulative frequency curve, or an ogive (of the more than 
type). (See Fig. 14.2) 


Fig. 14.2 : ‘More than’ ogive (cumulative frequency plotted against lower limits) 
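For the ‘more than’ type, the cumulative frequency at a lower limit is the number of observations still remaining at or above that limit. A minimal sketch follows; as before, the list `frequencies` is an assumption inferred from the cumulative frequencies quoted in the text.

```python
# Lower limits of the class intervals 0 - 10, 10 - 20, ..., 90 - 100.
lower_limits = list(range(0, 100, 10))

# Class frequencies inferred from the quoted cumulative frequencies (an assumption).
frequencies = [5, 3, 4, 3, 3, 4, 7, 9, 7, 8]

more_than_cf = []
remaining = sum(frequencies)  # all observations are >= the first lower limit
for f in frequencies:
    more_than_cf.append(remaining)
    remaining -= f  # observations in this class are below the next lower limit

# Points (lower limit, cumulative frequency) to be plotted.
more_than_points = list(zip(lower_limits, more_than_cf))
print(more_than_points)
```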
Remark : Note that both the ogives (in Fig. 14.1 and Fig. 14.2) correspond to the 
same data, which is given in Table 14.12. 


Now, are the ogives related to the median in any way? Is it possible to obtain the 
median from these two cumulative frequency curves corresponding to the data in 
Table 14.12? Let us see. 









One obvious way is to locate $\frac{n}{2} = \frac{53}{2} = 26.5$ on the y-axis (see Fig. 
14.3). From this point, draw a line parallel to the x-axis cutting the curve at a point. 
From this point, draw a perpendicular to the x-axis. The point of intersection of 
this perpendicular with the x-axis determines the median of the data (see 
Fig. 14.3). 

Fig. 14.3 : ‘Less than’ ogive with the median (66.4) located on the x-axis (upper limits) 

Another way of obtaining the median is the following : 

Draw both ogives (i.e., of the less than type and of the more than type) on 
the same axis. The two ogives will intersect each other at a point. From this 
point, if we draw a perpendicular on the x-axis, the point at which it cuts the 
x-axis gives us the median (see Fig. 14.4). 

Fig. 14.4 : ‘Less than’ and ‘more than’ ogives on the same axes, intersecting above the median (66.4) 
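The first graphical method can be imitated numerically. The sketch below assumes that consecutive points of the ‘less than’ ogive are joined by straight line segments rather than a free hand curve, so its result is only an approximation to a reading taken from the graph. The function name is hypothetical.

```python
def median_from_less_than_ogive(points, n):
    """Return the x-value at which the ogive reaches n / 2.

    `points` are (upper limit, cumulative frequency) pairs in increasing
    order. Straight-line segments between points are an assumption.
    """
    target = n / 2
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 < target <= y1:
            # Linear interpolation within the segment that crosses n / 2.
            return x0 + (target - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("n / 2 lies outside the plotted cumulative frequencies")


# Points of the 'less than' ogive quoted in the text; n = 53.
less_than_points = [(10, 5), (20, 8), (30, 12), (40, 15), (50, 18),
                    (60, 22), (70, 29), (80, 38), (90, 45), (100, 53)]
median_estimate = median_from_less_than_ogive(less_than_points, 53)
print(round(median_estimate, 1))
```

With these points the value 26.5 is reached between the upper limits 60 and 70, which agrees with the median of about 66.4 marked in the figures.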
Example 9 : The annual profits earned by 30 shops of a shopping complex in a 
locality give rise to the following distribution : 
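Profit (Rs in lakhs) | Number of shops 
More than or equal to 5 | 30 
More than or equal to 10 | 28 
More than or equal to 15 | 16 
More than or equal to 20 | 14 
More than or equal to 25 | 10 
More than or equal to 30 | 7 
More than or equal to 35 | 3 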


Draw both ogives for the data above. 
Hence obtain the median profit. 


Solution : We first draw the coordinate axes, with lower limits of the profit along 
the horizontal axis, and the cumulative frequency along the vertical axis. Then, 
we plot the points (5, 30), (10, 28), (15, 16), (20, 14), (25, 10), (30, 7) and (35, 3). We 
join these points with a smooth curve to get the ‘more than’ ogive, as shown in 
Fig. 14.5. 

Fig. 14.5 : ‘More than’ ogive (cumulative frequency plotted against lower limits of profit, Rs in lakhs) 

Now, let us obtain the classes, their frequencies and the cumulative frequency 
from the table above. 

Table 14.17 

Classes | No. of shops | Cumulative frequency 
5 - 10 | 2 | 2 
10 - 15 | 12 | 14 
15 - 20 | 2 | 16 
20 - 25 | 4 | 20 
25 - 30 | 3 | 23 
30 - 35 | 4 | 27 
35 - 40 | 3 | 30 
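The step of recovering the classes, their frequencies and the ‘less than’ cumulative frequencies from a ‘more than’ distribution can be sketched as follows. The class width of 5 is read off from the consecutive lower limits and is assumed to be uniform; the variable names are hypothetical.

```python
from itertools import accumulate

# 'More than or equal to' distribution of Example 9: (lower limit, number of shops).
lower_limits = [5, 10, 15, 20, 25, 30, 35]
more_than_cf = [30, 28, 16, 14, 10, 7, 3]
class_width = 5  # assumption: uniform width, taken from consecutive lower limits

# Frequency of a class = shops at or above its lower limit
# minus shops at or above the next lower limit (0 after the last class).
frequencies = [a - b for a, b in zip(more_than_cf, more_than_cf[1:] + [0])]

classes = [(lower, lower + class_width) for lower in lower_limits]
less_than_cf = list(accumulate(frequencies))

for (lower, upper), f, cf in zip(classes, frequencies, less_than_cf):
    print(f"{lower} - {upper}: frequency {f}, cumulative frequency {cf}")
```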





Using these values, we plot the points 
(10, 2), (15, 14), (20, 16), (25, 20), (30, 23), 
(35, 27), (40, 30) on the same axes as in 
Fig. 14.5 to get the ‘less than’ ogive, as 
shown in Fig. 14.6. 

The abscissa of the point of intersection of the two ogives is 
nearly 17.5, which is the median. This can 
also be verified by using the formula. 
Hence, the median profit (in lakhs) is 
Rs 17.5. 
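The point of intersection of the two ogives can also be estimated numerically. The sketch below assumes that each ogive is drawn with straight line segments between the plotted points (a free hand curve may give a slightly different reading); the function names are hypothetical.

```python
def interpolate(points, x):
    """Value at x of the straight-line curve through `points` (sorted by x)."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x is outside the range of the curve")


def ogive_intersection(less_than, more_than):
    """Abscissa where the rising 'less than' ogive meets the falling 'more than' ogive."""
    lo = max(less_than[0][0], more_than[0][0])
    hi = min(less_than[-1][0], more_than[-1][0])
    # Every breakpoint of either curve that lies in the common range.
    xs = sorted({x for x, _ in less_than + more_than if lo <= x <= hi})
    for x0, x1 in zip(xs, xs[1:]):
        d0 = interpolate(less_than, x0) - interpolate(more_than, x0)
        d1 = interpolate(less_than, x1) - interpolate(more_than, x1)
        if d0 == 0:
            return x0
        if d0 < 0 <= d1:
            # The difference is linear on this segment; find where it is zero.
            return x0 + (x1 - x0) * (-d0) / (d1 - d0)
    raise ValueError("the ogives do not cross in their common range")


less_than = [(10, 2), (15, 14), (20, 16), (25, 20), (30, 23), (35, 27), (40, 30)]
more_than = [(5, 30), (10, 28), (15, 16), (20, 14), (25, 10), (30, 7), (35, 3)]
median_profit = ogive_intersection(less_than, more_than)
print(median_profit)
```

For the points of Example 9 the curves cross between 15 and 20, at 17.5, in agreement with the graphical reading.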


Remark : In the above examples, it may 
be noted that the class intervals were 
continuous. For drawing ogives, it should 
be ensured that the class intervals are 
continuous. (Also see constructions of 
histograms in Class IX) 


Fig. 14.6 : ‘More than’ and ‘less than’ ogives on the same axes (cumulative frequency plotted against profit, Rs in lakhs), intersecting above the median (17.5) 
EXERCISE 14.4 


1. The following distribution gives the daily income of 50 workers of a factory. 





Convert the distribution above to a less than type cumulative frequency distribution, 
and draw its ogive. 


2. During the medical check-up of 35 students of a class, their weights were recorded as 
follows: 





Draw a less than type ogive for the given data. Hence obtain the median weight from 
the graph and verify the result by using the formula. 


3. The following table gives production yield per hectare of wheat of 100 farms of a village. 





Change the distribution to a more than type distribution, and draw its ogive. 


14.6 Summary 

In this chapter, you have studied the following points: 
1. The mean for grouped data can be found by : 

(i) the direct method : $\bar{x} = \dfrac{\sum f_i x_i}{\sum f_i}$ 

(ii) the assumed mean method : $\bar{x} = a + \dfrac{\sum f_i d_i}{\sum f_i}$ 

(iii) the step deviation method : $\bar{x} = a + \left(\dfrac{\sum f_i u_i}{\sum f_i}\right) \times h$, 

with the assumption that the frequency of a class is centred at its mid-point, called its 
class mark. 


2. The mode for grouped data can be found by using the formula: 

Mode = $l + \left(\dfrac{f_1 - f_0}{2f_1 - f_0 - f_2}\right) \times h$ 

where symbols have their usual meanings. 


3. The cumulative frequency of a class is the frequency obtained by adding the frequencies 
of all the classes preceding the given class to the frequency of the given class. 


4. The median for grouped data is found by using the formula: 

Median = $l + \left(\dfrac{\frac{n}{2} - cf}{f}\right) \times h$ 

where symbols have their usual meanings. 


5. A cumulative frequency distribution can be represented graphically as a cumulative 
frequency curve, or an ogive, of the less than type and of the more than type. 


6. The median of grouped data can be obtained graphically as the x-coordinate of the point 
of intersection of the two ogives for this data. 
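The formulae summarised above can be tried out on one set of grouped data. The sketch below uses the classes of Example 9 with frequencies inferred from its cumulative frequencies (an assumption, as is the choice of assumed mean a = 22.5); the variable names are hypothetical.

```python
classes = [(5, 10), (10, 15), (15, 20), (20, 25), (25, 30), (30, 35), (35, 40)]
freqs = [2, 12, 2, 4, 3, 4, 3]   # inferred from Example 9 (an assumption)
h = 5                            # class size
n = sum(freqs)
marks = [(lower + upper) / 2 for lower, upper in classes]  # class marks x_i

# Mean: direct, assumed mean and step deviation methods.
mean_direct = sum(f * x for f, x in zip(freqs, marks)) / n
a = 22.5                         # assumed mean: any class mark will do
mean_assumed = a + sum(f * (x - a) for f, x in zip(freqs, marks)) / n
mean_step = a + sum(f * (x - a) / h for f, x in zip(freqs, marks)) / n * h

# Mode: l + (f1 - f0) / (2 f1 - f0 - f2) * h for the modal class.
k = freqs.index(max(freqs))
f1 = freqs[k]
f0 = freqs[k - 1] if k > 0 else 0
f2 = freqs[k + 1] if k < len(freqs) - 1 else 0
mode = classes[k][0] + (f1 - f0) / (2 * f1 - f0 - f2) * h

# Median: l + (n/2 - cf) / f * h, where cf is the cumulative frequency
# of the class preceding the median class.
cf = 0
for (lower, _), f in zip(classes, freqs):
    if cf + f >= n / 2:
        median = lower + (n / 2 - cf) / f * h
        break
    cf += f

print(mean_direct, mean_assumed, mean_step, mode, median)
```

All three methods give the same mean, as they must, and the median agrees with the value 17.5 obtained from the ogives in Example 9.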


A NOTE TO THE READER 


For calculating mode and median for grouped data, it should be 
ensured that the class intervals are continuous before applying the 
formulae. The same condition also applies for construction of an ogive. 
Further, in case of ogives, the scale may not be the same on both the axes. 